The disruptive positions in human G-quadruplex motifs are less polymorphic and more conserved than their neutral counterparts
Abstract
Specific guanine-rich sequence motifs in the human genome have considerable potential to form four-stranded structures known as G-quadruplexes or G4 DNA. The enrichment of these motifs in key chromosomal regions has suggested a functional role for the G-quadruplex structure in genomic regulation. In this work, we have examined the spectrum of nucleotide substitutions in G4 motifs and related this spectrum to G4 prevalence. Data collected from the large repository of human SNPs indicate that the core feature of G-quadruplex motifs, 5'-GGG-3', exhibits specific mutational patterns that preserve the potential for G4 formation. In particular, we find a genome-wide pattern in which sites that disrupt the guanine triplets are more conserved and less polymorphic than their neutral counterparts. This also holds when considering non-CpG sites only. However, the low level of polymorphism in guanine tracts is not confined to G4 motifs. A complete mapping of DNA three-mers at guanine polymorphisms indicated that short guanine tracts are the most under-represented sequence context at polymorphic sites. Furthermore, we provide evidence for a strand bias upstream of human genes. Here, a significantly lower rate of G4-disruptive SNPs on the non-template strand supports a higher relative influence of G4 formation on this strand during transcription.
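The distinction between G4-disruptive and neutral positions can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes a commonly used motif definition (four runs of at least three guanines separated by loops of 1 to 7 nucleotides), which the abstract does not state, and the names `G4_PATTERN`, `has_g4_motif` and `classify_substitution` are hypothetical. A substitution is called disruptive when the mutated sequence no longer matches the motif, and neutral when it still does.

```python
import re

# Assumed motif definition (not given in the abstract): four guanine runs
# of length >= 3, separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def has_g4_motif(seq: str) -> bool:
    """Return True if the sequence contains at least one assumed G4 motif."""
    return G4_PATTERN.search(seq.upper()) is not None


def classify_substitution(seq: str, pos: int, alt: str):
    """Classify a single-nucleotide substitution at 0-based position `pos`.

    Returns 'disruptive' if the mutated sequence no longer contains a motif,
    'neutral' if a motif is still present, and None if the original
    sequence had no motif to begin with.
    """
    seq = seq.upper()
    if not has_g4_motif(seq):
        return None
    mutated = seq[:pos] + alt.upper() + seq[pos + 1:]
    return "neutral" if has_g4_motif(mutated) else "disruptive"


if __name__ == "__main__":
    motif = "GGGAGGGTGGGCGGG"
    # Substituting a guanine in the first triplet breaks the motif.
    print(classify_substitution(motif, 0, "A"))
    # Substituting a loop nucleotide leaves all four triplets intact.
    print(classify_substitution(motif, 3, "T"))
```

Under this assumed definition, positions inside a minimal GGG run are disruptive, while loop positions are typically neutral. A real analysis would also scan the reverse complement for C-rich motifs on the opposite strand.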
Similar resources
Conserved elements with potential to form polymorphic G-quadruplex structures in the first intron of human genes
To understand how potential for G-quadruplex formation might influence regulation of gene expression, we examined the 2 kb spanning the transcription start sites (TSS) of the 18 217 human RefSeq genes, distinguishing contributions of template and nontemplate strands. Regions both upstream and downstream of the TSS are G-rich, but the downstream region displays a clear bias toward G-richness on ...
G-quadruplex prediction in E. coli genome reveals a conserved putative G-quadruplex-Hairpin-Duplex switch
Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences are being discovered since the development of next generation sequencing technology. A common approach to identify conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonic...
G-Quadruplex DNA Sequences Are Evolutionarily Conserved and Associated with Distinct Genomic Features in Saccharomyces cerevisiae
G-quadruplex DNA is a four-stranded DNA structure formed by non-Watson-Crick base pairing between stacked sets of four guanines. Many possible functions have been proposed for this structure, but its in vivo role in the cell is still largely unresolved. We carried out a genome-wide survey of the evolutionary conservation of regions with the potential to form G-quadruplex DNA structures (G4 DNA ...
Zinc-finger transcription factors are associated with guanine quadruplex motifs in human, chimpanzee, mouse and rat promoters genome-wide
The function of non-B DNA structures is poorly understood, though several bioinformatics studies predict a role for the G-quadruplex DNA structure in transcription. Earlier, using transcriptome profiling, we found evidence of widespread G-quadruplex-mediated gene regulation. Herein, we asked whether potential G-quadruplex (PG4) motifs associate with transcription factors (TF). This was analyzed using 2...
Sugar-modified G-quadruplexes: effects of LNA-, 2′F-RNA– and 2′F-ANA-guanosine chemistries on G-quadruplex structure and stability
G-quadruplex-forming oligonucleotides containing modified nucleotide chemistries have demonstrated promising pharmaceutical potential. In this work, we systematically investigate the effects of sugar-modified guanosines on the structure and stability of a (4+0) parallel and a (3+1) hybrid G-quadruplex using over 60 modified sequences containing a single-position substitution of 2'-O-4'-C-methyl...